Introduction

The board game go is one of the grand challenges of game-related artificial intelligence. While go is a finite game of perfect information, its game tree is prohibitively large for tree searches. The situation in go is perhaps best compared with that in chess, which has been called ``the drosophila of artificial intelligence'', meaning that a multitude of key problems of artificial intelligence appear in chess and can be studied there. Currently, the best chess-playing programs rely heavily on large-scale tree searches and use a comparatively small amount of chess knowledge. In go it is the other way around: the quality of the go knowledge is the determining factor. Developing strong go programs therefore emphasizes the human way of thinking about go much more than the brute-force methods at which computers excel, and from this point of view one could argue that go is a much more interesting problem than chess. Many important advances have been made in go regarding the application of rule-based knowledge representations, pattern recognition and machine learning. For a recent, lucid review of computer go see [1].

In this article we want to look at go from a completely different angle: How would nature play go? This is not the starting point of a metaphysical discussion; it is an attempt to learn from physics about a method of optimization which nature applies very successfully and ``naturally''. We will consider simulated annealing [2,3,4], which under surprisingly general circumstances is able to find approximations to the extrema of a mathematical function. Our topic is the question of how to formulate the task of finding a good move as an extremization problem tailored to the strengths of simulated annealing.
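To make the optimization method concrete before applying it to go, here is a minimal, generic sketch of simulated annealing. This is an illustration, not code from the program described below; the function names, the geometric cooling schedule, and the toy objective are choices made for the example. The essential mechanism is that a step which lowers the energy is always accepted, a step which raises it by dE is accepted with probability exp(-dE/T), and the temperature T is slowly lowered.

```python
import math
import random

def simulated_annealing(energy, neighbor, start, steps=20000, t0=10.0, t_min=1e-3):
    """Minimize `energy`: always accept downhill steps, accept uphill
    steps with probability exp(-dE/T), and cool T geometrically."""
    state, e = start, energy(start)
    best, best_e = state, e
    for k in range(steps):
        t = t0 * (t_min / t0) ** (k / steps)  # geometric cooling schedule
        cand = neighbor(state)
        de = energy(cand) - e
        if de <= 0 or random.random() < math.exp(-de / t):
            state, e = cand, e + de
        if e < best_e:
            best, best_e = state, e
    return best, best_e

# toy demo: a one-dimensional function with many local minima
f = lambda x: x * x + 10.0 * math.sin(3.0 * x)
random.seed(0)  # fixed seed only to make the demo reproducible
x, fx = simulated_annealing(f, lambda x: x + random.gauss(0.0, 0.5), start=5.0)
```

The slow lowering of T is the point: at high temperature the walk escapes local minima, and as T drops it settles into a deep one, which is why the method works under such general circumstances.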

Perhaps it is worthwhile to describe the main idea right away in very simple terms to indicate the direction in which we are heading. There are essentially three ingredients in computer programs for playing games like go: (i) look-ahead, where different sequences of moves are tried; (ii) evaluation, where the value of a position is computed without explicitly performing moves; and (iii) an overall scheme for how (i) and (ii) are combined. Here is our proposal:

(i) moves are performed randomly with probabilities assigned by the method of simulated annealing,
(ii) the value of a position in which the game is over is defined by counting, and
(iii) to find the best move in a given position play the game to the very end as suggested by (i) and then evaluate as in (ii); play many such random games, and the best move will be the one which does best on average.
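The scheme (i)-(iii) can be sketched in a game-independent way. The fragment below is an illustration, not the implementation described later: for simplicity it finishes games with uniformly random moves rather than the annealing probabilities of step (i), and it is demonstrated on a toy game (one-heap Nim) rather than go; all names are invented for the example.

```python
import random

def monte_carlo_move(state, legal_moves, play, is_over, score, n_games=500):
    """Caricature of (i)-(iii): for each candidate move, finish the game
    n_games times with uniformly random moves (a crude stand-in for the
    annealing probabilities of (i)), score each final position as in (ii),
    and return the move with the best average result, as in (iii)."""
    best_move, best_avg = None, float("-inf")
    for move in legal_moves(state):
        total = 0
        for _ in range(n_games):
            s = play(state, move)
            while not is_over(s):
                s = play(s, random.choice(legal_moves(s)))
            total += score(s)
        avg = total / n_games
        if avg > best_avg:
            best_move, best_avg = move, avg
    return best_move

# toy demo: one-heap Nim, take 1-3 stones, whoever takes the last stone
# wins; a state is (stones, player_to_move) and the root player is 0
legal = lambda s: list(range(1, min(3, s[0]) + 1))
play = lambda s, k: (s[0] - k, 1 - s[1])
over = lambda s: s[0] == 0
score = lambda s: 1 if s[1] == 1 else -1  # +1 iff the root player took the last stone
random.seed(0)
best = monte_carlo_move((5, 0), legal, play, over, score)
# with enough random games this tends to pick the winning move: take 1, leaving 4
```

Note that nothing here requires an evaluation function for unfinished positions: only terminal positions are scored, exactly as in (ii).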

Let me quickly point out the obvious. The above idea is certainly simple, and I believe that any good idea can be stated simply. However, not every simple idea is a good one, and at first sight the above one may sound ridiculous. The main portion of this article is dedicated to convincing the reader that this is at least not necessarily a bad idea. Furthermore, one may suspect that when considered in detail, (i), (ii), and (iii) are not as harmless as they seem. Let me just state that the actual implementation in the go program described below is still quite simple, but one can hope that the more sophisticated the design, the better the results.

The paper is organized as follows. First, we introduce some background on simulated annealing. Then we discuss how it can be applied to the problem at hand; in particular, we discuss why our approach could work in principle. In the next section a computer implementation, Gobble, is presented, together with some examples of the strengths and weaknesses Gobble displays in games on the 9x9 board. We conclude with a general discussion.